Distributed asynchronous optimization of convolutional neural networks

نویسندگان

  • William Chan
  • Ian Lane
چکیده

Recently, deep Convolutional Neural Networks have been shown to outperform Deep Neural Networks for acoustic modelling, producing state-of-the-art accuracy in speech recognition tasks. Convolutional models provide increased model robustness through the usage of pooling invariance and weight sharing across spectrum and time. However, training convolutional models is a very computationally expensive optimization procedure, especially when combined with large training corpora. In this paper, we present a novel algorithm for scalable training of deep Convolutional Neural Networks across multiple GPUs. Our distributed asynchronous stochastic gradient descent algorithm incorporates sparse gradients, momentum and gradient decay to accelerate the training of these networks. Our approach is stable, neither requiring warm-starting or excessively large minibatches. Our proposed approach enables convolutional models to be efficiently trained across multiple GPUs, enabling a model to be scaled asynchronously across 5 GPU workers with ≈ 68% efficiency.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Asynchronous Distributed Neural Network Training using Alternating Direction Method of Multipliers

Since the first appearance of a large-scale dataset [4] and powerful computational resources such as GPUs, Convolutional Neural Networks(CNN) became the essential machine learning algorithm for image classification, detection, and many application. As the popularity of CNN increases, the size of CNN increased as well[10, 13]. (For instance, AlexNet has more than 60 milion parameters.) The perfo...

متن کامل

GoSGD: Distributed Optimization for Deep Learning with Gossip Exchange

We address the issue of speeding up the training of convolutional neural networks by studying a distributed method adapted to stochastic gradient descent. Our parallel optimization setup uses several threads, each applying individual gradient descents on a local variable. We propose a new way of sharing information between different threads based on gossip algorithms that show good consensus co...

متن کامل

A multi-scale convolutional neural network for automatic cloud and cloud shadow detection from Gaofen-1 images

The reconstruction of the information contaminated by cloud and cloud shadow is an important step in pre-processing of high-resolution satellite images. The cloud and cloud shadow automatic segmentation could be the first step in the process of reconstructing the information contaminated by cloud and cloud shadow. This stage is a remarkable challenge due to the relatively inefficient performanc...

متن کامل

Estimation of Hand Skeletal Postures by Using Deep Convolutional Neural Networks

Hand posture estimation attracts researchers because of its many applications. Hand posture recognition systems simulate the hand postures by using mathematical algorithms. Convolutional neural networks have provided the best results in the hand posture recognition so far. In this paper, we propose a new method to estimate the hand skeletal posture by using deep convolutional neural networks. T...

متن کامل

Cystoscopy Image Classication Using Deep Convolutional Neural Networks

In the past three decades, the use of smart methods in medical diagnostic systems has attractedthe attention of many researchers. However, no smart activity has been provided in the eld ofmedical image processing for diagnosis of bladder cancer through cystoscopy images despite the highprevalence in the world. In this paper, two well-known convolutional neural networks (CNNs) ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014